Social Web Meets Sensor Web: From User-Generated Content to Linked Crowdsourced Observation Data
Authors
Abstract
The reach of dominant social media such as Facebook and Twitter in the current population is enormous, and these media have long been leveraged for diverse applications. In particular, for some citizen science projects, existing social media are increasingly becoming the platforms on which participants interact and contribute. These user contributions, often termed User-Generated Content (UGC), can be a mixed bag of posts, comments, images, and other media. In this paper we report work in progress on formalizing user contributions from a large Facebook group (more than 4,000 users) established for biodiversity observation. A major part of our work is to extract structured datasets with well-defined semantics from unstructured UGC collections. We use common vocabularies from Darwin Core (DwC), Friend-of-a-Friend (FOAF), Semantically-Interlinked Online Communities (SIOC), and Semantic Sensor Network (SSN), among others, to formalize the extracted datasets and hence make them readily linkable. A nice consequence of this approach is that a multi-faceted browser can be quickly built to explore biodiversity information in large collections of UGC.

∗ This research is supported in part by the Ministry of Science and Technology (grant no. 102-2627-M-001-009) and by the Endemic Species Research Institute, Council of Agriculture, Taiwan. We are grateful to Te-En Lin and his group at the Endemic Species Research Institute for their help with the collected data.

† Dong-Po Deng is also a PhD candidate at the Faculty of Geo-Information Science and Earth Observation (ITC), University of Twente.

‡ Tyng-Ruey Chuang is also affiliated with the Research Center for Information Technology Innovation and the Research Center for Humanities and Social Sciences (Center for Geographic Information Science), both at Academia Sinica.

This paper is released under the Creative Commons Attribution 4.0 License. You are free to share and adapt this paper for any purpose, even commercially, as long as you give appropriate credit, provide a link to the license, and indicate if changes were made. These freedoms cannot be revoked as long as you follow the license terms. For a copy of the license, please visit .

Linked Data on the Web (LDOW2014), April 8, 2014, Seoul, Korea.
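As a rough illustration of what formalizing a piece of UGC with these vocabularies might look like, the sketch below turns one already-extracted post into N-Triples lines. It is not the authors' data model: the namespace IRIs are the published ones for DwC, FOAF, SIOC, SSN, and Dublin Core Terms, but the choice of terms, the `post_to_ntriples` helper, the `example.org` identifiers, and the sample record are all assumptions made for illustration.

```python
# Hypothetical sketch only: the term selection and example.org IRIs are
# assumptions, not the mapping used in the paper.
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dcterms": "http://purl.org/dc/terms/",
    "dwc": "http://rs.tdwg.org/dwc/terms/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "sioc": "http://rdfs.org/sioc/ns#",
    "ssn": "http://purl.oclc.org/NET/ssnx/ssn#",
}
BASE = "http://example.org/"  # placeholder base IRI (assumption)


def iri(curie):
    """Expand a prefixed name such as 'sioc:Post' into an N-Triples IRI."""
    prefix, local = curie.split(":", 1)
    return "<" + PREFIXES[prefix] + local + ">"


def literal(text):
    """Render a plain string literal with basic N-Triples escaping."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return '"' + escaped + '"'


def post_to_ntriples(post):
    """Describe one extracted post as a SIOC post plus a linked observation."""
    post_iri = "<" + BASE + "post/" + post["id"] + ">"
    user_iri = "<" + BASE + "user/" + post["author_id"] + ">"
    obs_iri = "<" + BASE + "observation/" + post["id"] + ">"
    triples = [
        (post_iri, iri("rdf:type"), iri("sioc:Post")),
        (post_iri, iri("sioc:content"), literal(post["text"])),
        (post_iri, iri("sioc:has_creator"), user_iri),
        (user_iri, iri("rdf:type"), iri("sioc:UserAccount")),
        (user_iri, iri("foaf:name"), literal(post["author"])),
        (obs_iri, iri("rdf:type"), iri("ssn:Observation")),
        (obs_iri, iri("dcterms:source"), post_iri),
        (obs_iri, iri("dwc:scientificName"), literal(post["species"])),
        (obs_iri, iri("dwc:eventDate"), literal(post["date"])),
    ]
    return [" ".join(triple) + " ." for triple in triples]


if __name__ == "__main__":
    # Made-up sample record standing in for one extracted post.
    sample = {
        "id": "1",
        "author_id": "u42",
        "author": "Example User",
        "text": "Seen on the road this morning.",
        "species": "Genus species",
        "date": "2014-04-08",
    }
    print("\n".join(post_to_ntriples(sample)))
```

Keeping the post (the social-web side) and the observation (the sensor-web side) as separate resources joined by a link is one way to let each be described with its own vocabulary; other groupings are equally plausible.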
Similar resources
Managing Quality of Crowdsourced Data
The Web is the central medium for discovering knowledge via various sources such as blogs, social media, and wikis. It facilitates access to contents provided by a large number of users, regardless of their geographical locations or cultural backgrounds. Such user-generated content is often referred to as crowdsourced data, which provides informational benefit in terms of variety and scale. Yet...
Developing Knowledge Models of Social Media: A Case Study on LinkedIn
User Generated Content (UGC) exchanged [1] via large Social Network is considered a very important knowledge source about all aspects of the social engagements (e.g. interests, events, personal information, personal preferences, social experience, skills etc.). However this data is inherently unstructured or semi-structured. In this paper, we describe the results of a case study on LinkedIn Ire...
A Web Mashup for Social Libraries
User-generated content on the Social Web is often locked within information silos. Inadequate APIs or, worst, the lack of APIs obstruct reuse and prevent the opportunity to integrate similar content from different communities. In this paper we present a Web mashup which combines information from different social libraries. Aggregated information, including both classic book metadata and user-ge...
Data Extraction using Content-Based Handles
In this paper, we present an approach and a visual tool, called HWrap (Handle Based Wrapper), for creating web wrappers to extract data records from web pages. In our approach, we mainly rely on the visible page content to identify data regions on a web page. In our extraction algorithm, we inspired by the way a human user scans the page content for specific data. In particular, we use text fea...
Special issue on real-time and ubiquitous social semantics
The Web has shifted from its initial document and librarian paradigm to an ecology of socially-generated data and services. Websites such as Twitter, Facebook, and FourSquare, emphasise the huge popularity of sharing information in real-time. In addition, the wealth and breadth of applications that exploit open social networking APIs to provide new services and functionalities are growing rapid...
Journal title:
Volume, issue
Pages -
Publication date: 2014